home *** CD-ROM | disk | FTP | other *** search
Text File | 1991-09-29 | 46.7 KB | 1,043 lines |
- On Synchronizing DARTnet Clocks to the Millisecond - a Report
-
- David L. Mills
- University of Delaware
- 3 February 1991
-
- 1. Introduction
-
- This is a status report on the hunt toward synchronizing DARTnet clocks
- reliably to the millisecond. Included in this report are the results of
- a preliminary investigation into the timekeeping quality expected for
- DARTnet experiments. The investigation suggests that DARTnet routers can
- attain and sustain accuracies to the order of a millisecond and
- stabilities to the order of a millisecond per day. It also points out
- certain routing problems that can seriously undermine the confidence
- that this level of performance can be guaranteed for all experiments.
- This should be a high priority problem to resolve. Also in this report
- is a description of the fuzzball clock statistics mechanism together
- with file formats and data interpretation which can be used to calibrate
- experimental results after the fact in order to realize the ultimate
- timekeeping accuracy.
-
- All operating DARTnet routers are now running NTP Version 2 with each
- other and with two NTP stratum-1 (primary) time servers, wwvb.isi.edu at
- ISI and dcn5.udel.edu at UDel. The fuzzballs are running NTP Version 3
- as described in the latest revision of the protocol specification found
- in the PostScript file pub/ntp/ntpv3.ps.Z on louie.udel.edu. While
- Version 3 is backwards-compatible with Version 2, some minor differences
- in status reporting exist. The result of this is that the Unix program
- ntpq, ordinarily used to probe NTP time servers for status, operates
- slightly differently with the Unix xntpd and fuzzball servers. (The main
- difference is that the tattletale indicator reported just after the peer
- ID is not always meaningful for the fuzzball.)
-
- The investigation also happened to reveal certain design problems with
- the fuzzball time servers used by DARTnet to synchronize local clocks to
- Coordinated Universal Time (UTC). These problems were due to the
- increasing demands placed upon the primary servers by a rapidly
- escalating cadre of users and involved relatively high timing jitter due
- to interrupt latencies, queueing delays and so forth. These problems
- have largely been alleviated by subtle and intricate jinks in the code;
- however, the problem remains that time service is suddenly getting very
- popular. To that end I have incorporated access controls in the UDel
- time servers designed to help hold down incidences of casual abuse, such
- as many users from the same network ganging up on the same server. I
- intend to update ISI and the remaining fuzzball time servers shortly. A
- message to that effect has been sent to the NTP interest list along with
- guidelines that should help reduce friendly fire casualties in future.
-
- 2. Retrospective Timekeeping Data
-
- I have configured a dedicated fuzzball dcn2.udel.edu (actually, one that
- is to be shipped to MIT when the NEARnet NSS is installed there). This
- machine is internally synchronized to the UDel master clocks and
- continuously watches every interface on every DARTnet router, as well as
- selected fuzzball time servers. The data are reported to a file which
- can be retrieved via FTP. For most purposes this statistical luxury can
- be ignored, since I intend to watch the clocks as part of my regular
- snoop rounds. However, for those experiments where timestamps reliable
- to the millisecond are required, experimenters may need to calibrate
- extracted timestamps using retrospective offset measurements available
- in this file.
-
- The binary data file is called stats.dat and is accessible via login
- tcp-test, password cerf (in the finest historical tradition). It
- consists of records of eight 16-bit words in little-endian, binary
- representation. The records are in the format:
-
- +------+------+------+------+------+------+------+------+
- | Day | ToD | Code | Offset |Delay |Disper|
- +------+------+------+------+------+------+------+------+
- word 0 1 2 3 4 5 6 7
-
- Day NTP day, where day 0 is 1 January 1900; to obtain Modified
- Julian Day (MJD), add 41317 to this number
-
- ToD time of day relative to UTC midnight (milliseconds), high-
- order 32 bits followed by low-order 32 bits
-
- Code two-octet code, coded as follows
-
- 0-7 high-order octet of the peer status word described in
- Appendix B of RFC-1119 and ntpv3.ps, coded as follows
-
- 0 configured entry
- 1 peer authentication enabled
- 2 peer correctly authenticated (truechimer)
- 3 peer reachable
- 5-7 peer selection status, coded as follows (the
- character shown following the code is the tattletale
- indicator shown by the fuzzball monitoring program;
- however, the ntpq monitoring program may tattle
- different characters)
-
- 0 rejected
- 1 x passed sanity checks (tests 1 through 8 in
- Section 3.4.3)
- 2 . passed correctness checks (intersection
- algorithm in Section 4.2.1)
- 3 - passed candidate checks (only the first 10
- peers on the candidate list are considered for
- clock selection in the fuzzball implementation)
- 4 + passed outlyer checks (clustering algorithm in
- Section 4.2.2)
- 5 # current synchronization source; max distance
- exceeded (in the fuzzball implementation this
- means the synchronization distance exceeds 1000
- ms, normally considered real raunchy time)
- 6 * current synchronization source; max distance
- okay
- 7 reserved
-
- Note that these codes are cumulative; a status of n
- implies passing all tests at level n, together with
- all tests at levels less than n
-
- 8-15 peer ID
-
- Offset signed offset of peer clock relative to local clock
- (milliseconds), high-order 32 bits followed by low-order 32
- bits
-
- Delay signed roundtrip delay via peer clock to the root of the
- synchronization subnet (milliseconds); note: under some
- conditions this number can be negative
-
- Disper unsigned dispersion via peer clock to the root of the
- synchronization subnet (milliseconds)
-
- Note that the file is fixed size, normally 512K bytes, and is zero-
- padded at the end. Ordinarily, this provides for about a month of data
- recording, following which I rename the old file and create a new one.
- At present, there is no facility for automatically creating a new file
- when the old one fills up. Trust me.
-
- 3. How Goes the Tick
-
- In order to assess the accuracy and robustness of NTP-synchronized
- clocks, I use a special-purpose simulator that processes the statistical
- data and closely mimics the operation of the NTP daemon on a real
- machine. The program reads a file such as produced by the handy-dandy
- ASCII conversion routine described in Appendix A and produces a file
- showing the resulting behavior of the filtering, selecting, combining
- and phase-lock-loop algorithms. An example of the data input to this
- program is shown below (all numbers are in milliseconds, except the day:
- MJD and the Code: hex).
-
- Day ToD Code Offset Delay Disper
- -------------------------------------------
- 48289 79369 6115 -4 39 12
- 48289 106624 4204 -5 63 4
- 48289 119282 120a -13564 113 3623
- 48289 185707 120a -13129 115 3623
- 48289 213995 6115 -3 38 12
- 48289 217210 420e -6 116 27
- 48289 223560 6204 -5 62 3
- 48289 244875 2216 -6 192 27
- 48289 253305 130a 1182 153 2043
- 48289 276211 2208 -6 116 27
- 48289 277278 4209 -6 117 27
- 48289 279379 220b -6 118 27
- 48289 315930 3113 -86 262 11
- 48289 318826 4206 -4 65 3
- 48289 319932 130a 819 154 2043
- 48289 348936 3115 -3 38 11
- 48289 358368 6204 -5 63 3
- 48289 386798 130a 578 153 2043
- 48289 454050 130a 574 154 2043
- 48289 484507 3115 -3 38 11
- 48289 493956 6204 -5 63 3
- 48289 589423 130a 585 153 2043
- 48289 619696 3115 -3 37 11
- 48289 629130 6204 -5 63 3
- 48289 763976 6204 -5 63 3
- 48289 774498 4207 -6 63 3
- 48289 787098 2216 -7 195 42
- 48289 858592 1113 -33 159 12
- 48289 859525 120a -172 113 71
- 48289 889750 6115 -4 41 11
- 48289 898115 4204 -5 64 3
- 48289 911749 420f -5 138 13
- 48289 912807 3114 -10 48 2
- 48289 1016709 120d -28 132 16
- 48289 1023919 6115 -4 38 11
- 48289 1032084 4204 -5 62 3
- 48289 1056335 2216 -8 190 46
-
- The last two hex digits of the Code indicate the peer ID, from which we
- learn that peer 0x0a is in awful shape, for what cause I do not know.
- Host 0x13 happens to be ISI fuzzball, shows a systematic offset of --33
- ms due to a well known asymmetric path one way via DARTnet and the other
- way via NSFnet. Note that these are unfiltered data; things look much
- better after the NTP algorithms get to work.
-
- The following table is a snapshot taken from dcn2.udel.edu for every
- interface of every DARTnet router, along with fuzzball primary servers
- wwvb.isi.edu [128.9.2.129], dcn1.udel.edu [128.4.0.1] and dcn5.udel.edu
- [128.4.0.5], as well as a selected fuzzball secondary server
- clock.sura.net [192.80.214.42]. The data show the peer ID, IP address
- and (filtered) delay, offset and weight. The weight, which is used by
- the clock-combining algorithm (adapted from that used by NIST to produce
- the U.S. standard timescale from a bunch of cesium clocks), is computed
- as the quotient 65536 divided by the synchronization distance, itself
- computed as the total dispersion plus one-half the total delay to the
- root of the synchronization subnet, as described in the NTP Version 3
- specification. For this purpose higher values of weight indicate higher
- quality of data, as determined by the algorithm.
-
- ID Address Dly Ofs Wgt
- --------------------------------------
- 4 + [140.173.32.2] 25 -5 1638
- 5 - [140.173.16.1] 19 -6 2114
- 6 + [140.173.16.2] 25 -5 1985
- 7 + [140.173.64.1] 26 -4 1337
- 8 . [140.173.64.2] 78 -13 840
- 9 . [140.173.128.1] 78 -13 840
- 10 x [140.173.128.2] 80 -65 56
- 11 . [140.173.112.1] 78 -13 840
- 12 . [140.173.112.2] 88 -4 885
- 13 + [140.173.96.1] 87 -4 897
- 14 . [140.173.96.2] 77 -14 819
- 15 . [140.173.80.1] 90 -5 840
- 16 + [140.173.80.2] 87 -5 897
- 17 + [140.173.144.1] 88 -5 897
- 18 + [140.173.144.2] 90 -3 885
- 19 x [128.9.2.129] 169 -41 579
- 20 - [128.4.0.1] 39 -7 2340
- 21 * [128.4.0.5] 38 -3 3120
- 22 . [192.80.214.42] 55 -9 736
-
- Following the peer ID is the tattletale indicator, which shows the
- status of that entry as determined by the NTP algorithms. The tattletale
- coding is as described above for the statistics file entry. The entry
- marked "*" is the current synchronization source as determined by NTP;
- however, in the case of this particular fuzzball this source is not used
- to synchronize the machine itself, since a primary reference source is
- available internally. The two entries marked "x" have been discarded by
- the intersection (modified Marzullo) algorithm as suspect falsetickers,
- while the entries marked "." have been discarded from the end of the
- surviving candidate list (the fuzzballs keep a maximum of ten of the
- lowest-distance candidates on that list). The entries marked "-" have
- been discarded by the clustering (Mills) algorithm, while the remaining
- entries marked "+" have survived all hazards and are therefore eligible
- for processed by the weighted-average (NIST) combining algorithm which
- produces the final local-clock correction (whew).
-
- For comparison and evaluation, the following two tables show the offset,
- delay and dispersion for the last eight measurements made from
- dcn2.udel.edu to wwvb.isi.edu and dcn5.udel.edu. Note that the
- dispersion (due primarily to maximum assumed local-clock frequency
- offset) indicates the approximate age of the sample, in minutes.
-
- wwvb.isi.edu
- Offset -68 -41 -42 -38 -66 -69 -38 -71
- Delay 227 169 174 166 224 227 164 236
- Disper 1 17 34 50 67 83 117 133
-
- dcn5.udel.edu
- Offset -3 -4 -4 -5 -5 -6 -4 -4
- Delay 38 39 38 39 39 44 38 39
- Disper 1 5 9 13 17 22 26 30
- The systematic offset of wwvb.isi.edu is clearly apparent, as is the
- typical NSFnet noise on that the leg of the roundtrip path. The
- dcn5.udel.edu data show low dispersions typical of the quiet DARTnet.
- However, dcn5.udel.edu shows a systematic offset of about -5 ms due to
- subtle delays in the internal clock-synchronization scheme used locally.
- When we get all our wires uncrossed here, that curiosity should
- disappear. Nevertheless, what experimenters usually need is a
- calibration of experiment host and router offsets between each other, in
- order to determine a common reference timescale. In principle, it
- doesn't matter what the systematic offsets are, just that they can be
- determined retrospectively from the statistics file.
-
- 4. How Tight the Tick
-
- The question yet remains on the accuracy and stability of the particular
- local clock on each experiment host and router. It is not possible to
- observe this directly with the available hardware for other than
- fuzzballs, and, even for fuzzballs requires a precision interval counter
- (such as constructed by one of my students). However, the NTP local-
- clock model is described as an adaptive-parameter, type-II, phase-lock
- loop (PLL). Fortunately, one of the parameters estimated, called the
- compliance (Greek tau in Appendix G of the NTP Version 3 specification),
- is a particularly useful indicator of local-oscillator stability. Large
- negative values indicate an oscillator with stabilities in the range
- better than 1 ms per day, while more positive values indicate other
- oscillators varying over the range up to several seconds per day.
-
- For fuzzballs (only) an indirect indicator of the compliance is the poll
- interval, which is included in all NTP packets. This value, expressed as
- a power of 2 in seconds, is shown in both the Unix and fuzzball
- monitoring displays. In version 3 of the protocol the poll interval is
- tied to the compliance, with effect that the interval varies from about
- a minute for the most unstable oscillator (the power grid) to about 17
- minutes for most stable oscillators (LORAN-C receivers and cesium
- clocks). The displayed value can vary from 6 through 10, with the
- highest number indicating the highest stability. Ordinarily, this tidbit
- would be important only for experiments designed to run for some time
- (hours) while NTP is turned off.
-
- One of the goals of the NTP Version-3 models is to demonstrate
- accuracies to the millisecond, at least in DARTnet-like systems, as well
- as stabilities to the order of a millisecond per day. If these goals can
- be obtained in principle, a DARTnet platform could be synchronized, then
- NTP could be turned off during the experiment itself, if need be, with
- assurance the platform would not wander by more than fractional
- milliseconds during the experiment.
-
- In order to test whether these goals can be met, the simulator
- previously described was used to process the statistics data collected
- from dcn2.udel.edu for the last couple of weeks. A snatch of these data
- covering approximately the same interval used previously to illustrate
- the statistics format is shown below:
-
- Time Offset Freq Delay Dist Clks RefID
- ---------------------------------------------------------
- 20001.3 -5.711 -0.0023 41.0 33.8 8 20
- 20003.6 -5.708 -0.0023 40.0 31.5 6 20
- 20005.8 -5.704 -0.0023 41.0 29.5 6 20
- 20008.0 -5.699 -0.0022 40.0 28.5 6 20
- 20010.3 -5.698 -0.0022 39.0 29.9 6 20
- 20012.5 -5.698 -0.0022 39.0 29.5 6 20
- 20014.8 -5.697 -0.0022 42.0 30.8 6 20
- 20017.0 -5.692 -0.0022 39.0 31.5 10 20
- 20019.3 -5.689 -0.0022 39.0 28.4 10 20
- 20021.5 -5.686 -0.0022 39.0 28.6 10 20
- 20023.8 -5.683 -0.0022 40.0 28.8 10 20
- 20026.1 -5.680 -0.0021 38.0 27.7 10 20
- 20028.3 -5.676 -0.0021 38.0 29.4 10 20
- 20030.6 -5.673 -0.0021 39.0 28.1 10 20
- 20032.8 -5.670 -0.0021 39.0 30.3 10 20
- 20035.0 -5.662 -0.0021 42.0 32.2 10 20
- 20037.3 -5.657 -0.0020 40.0 28.6 10 20
- 20039.5 -5.652 -0.0020 40.0 30.6 10 20
- 20041.8 -5.649 -0.0020 39.0 29.9 10 20
- 20044.0 -5.645 -0.0020 39.0 31.6 10 20
- 20046.3 -5.642 -0.0020 39.0 29.5 10 20
- 20048.5 -5.637 -0.0019 40.0 29.7 10 20
- 20050.7 -5.630 -0.0019 40.0 30.4 9 20
- 20053.0 -5.623 -0.0019 40.0 33.0 9 20
- 20055.2 -5.615 -0.0018 39.0 29.6 9 20
- 20057.5 -5.607 -0.0018 39.0 28.7 9 20
- 20059.7 -5.600 -0.0018 39.0 30.4 9 20
- 20062.0 -5.593 -0.0017 40.0 27.4 9 20
- 20064.2 -5.589 -0.0017 43.0 33.5 9 20
- 20066.5 -5.580 -0.0017 40.0 29.9 10 20
- 20068.7 -5.572 -0.0017 41.0 30.7 10 20
- 20070.9 -5.567 -0.0016 38.0 27.2 7 20
- 20075.4 -5.552 -0.0016 38.0 32.7 9 20
- 20084.5 -5.528 -0.0014 40.0 36.0 9 20
- 20102.6 -5.470 -0.0011 38.0 40.1 9 20
- 20120.7 -5.415 -0.0007 37.0 42.4 9 20
- 20138.9 -5.377 -0.0005 38.0 40.0 9 20
- 20159.1 -5.340 -0.0003 38.0 43.4 8 21
- 20161.3 -5.338 -0.0003 39.0 38.8 8 21
- 20163.6 -5.335 -0.0002 38.0 41.2 8 21
- 20165.8 -5.331 -0.0002 38.0 39.6 8 21
- 20168.1 -5.327 -0.0002 38.0 39.5 8 21
- 20170.3 -5.323 -0.0002 37.0 38.9 8 21
- 20174.8 -5.316 -0.0002 37.0 42.4 8 21
- 20177.1 -5.313 -0.0001 38.0 37.1 8 21
-
- In this table time is shown in minutes past a certain epoch, which is
- not important here, while the offset, delay and distance are shown in
- milliseconds. The frequency column shows the computed stability in
- parts-per-million. The sixth column shows the number of peers remaining
- on the candidate list after the intersection algorithm (10 max), while
- the last column shows the resulting synchronization source selection.
-
- The above data shows that, at least for periods of a couple of hours,
- the "real" time stays within a few hundred microseconds and the
- stability within a few hundred microseconds per day. However, while
- these data are typical of DARTnet behavior, at least for the present
- lightly loaded network and with the currently broken kernel clock code,
- performance is not always this good. Following is a summary of the
- network behavior over about the last couple of weeks.
-
- ID Samples Mean StdDev Max Min
- --------------------------------------------------
- 0 5097 -4.122 1.719 0.983 -7.480
- 4 100 -4.930 7.879 73.000 -7.007
- 5 1608 -4.368 9.055 7.000 -103.000
- 6 1901 -3.967 2.810 19.000 -11.000
- 7 1929 -4.144 2.509 26.000 -11.000
- 8 558 -3.533 4.003 47.000 -26.000
- 9 547 -3.321 5.059 65.000 -25.000
- 10 968 -770.688 1025.602 1182.000-13564.000
- 11 549 -3.390 4.825 69.000 -25.000
- 12 951 -3.656 3.878 72.000 -17.000
- 13 445 -5.972 7.195 80.000 -30.000
- 14 549 -3.435 4.994 74.000 -25.000
- 15 945 -17.458 126.391 45.000 -1249.000
- 16 951 -3.561 4.118 71.000 -17.000
- 17 941 -3.604 3.735 72.000 -17.000
- 18 938 -6.643 29.347 56.000 -570.000
- 19 1017 -40.530 16.283 19.347 -234.000
- 20 2146 -5.368 4.084 27.000 -47.000
- 21 5419 -3.320 3.102 75.000 -24.000
- 22 499 -9.317 11.103 46.000 -76.000
-
- In this table the peer ID corresponds to the fuzzball monitoring table
- described previously, while the statistics are computed from the offset
- data in the statistics file described previously. However, peer ID zero
- corresponds to the output of the simulation program, so represents the
- computed offset of the local platform (dcn2.udel.edu) relative to the
- mean timescale represented by all the DARTnet players. This timescale is
- provably the best clock in the house, with maximum error over the entire
- interval less than about +-3.5 ms relative to the mean. Most of the
- residual error is due to transients created when the machine was
- rebooted. Obviously, something was certainly broken at entry 10 at some
- time over the period of observation and probably at entry 15. The
- systematic offset previously mentioned with entry 19 (wwvb.isi.edu) is
- clearly apparent, but entry 15 doesn't look to good, either.
-
- 5. How Timely the Tick
-
- There is some concern about the time it takes to tune the NTP
- contraption exactly on frequency. The long answer is that it takes days.
- The short answer is the long answer stashed in a file that is read upon
- reboot. The xntpd program presently does this, which considerably
- shortens the "training" interval while the initial transient in setting
- the clock damps out. While the PLL is mildly underdamped and stabilizes
- offset within a few percent in about an hour, large offset surges can
- produce frequency errors that may not completely settle down until a day
- or two have elapsed. Ordinarily, these niceties should not be of
- concern, unless accuracies to the millisecond are required. Rarely do
- the local-clock offsets exceed ten milliseconds even right after reboot.
-
- Since reboot frequencies may be somewhat larger than usual with DARTnet
- activities, some consideration of the initial synchronization procedure
- may help explain what may at first seem odd behavior. When an NTP client
- comes up it starts sending packets to each of its configured buddies.
- Upon receiving a validated reply (eight separate sanity checks plus
- crypto-authentication between fuzzballs - optional with otherballs), the
- offset/delay/dispersion sample is stashed in an eight-stage shift
- register, one for each peer.
-
- As succeeding samples arrive, the filtering, selecting and combining
- engines puff away to digest the data. As the shift registers fill up the
- overall synchronization distance computed from the dispersion data
- ordinarily decreases below a preset threshold, in the case of fuzzballs
- 1000 ms. During all this time the algorithms are doing their best to
- select the best out of possibly very noisy peers and the tattletale
- indicators wiggle as advertised. However, the local clock is not updated
- until the overall distance falls below the preset threshold. At this
- point the PLL kicks in and the local clock begins to hum. During the
- initial interval after reboot one peer may tattle "#" and many (up to
- 10) tattle "+". Ordinarily, after about ten minutesthe bumps and grinds
- should eventually converge with the "#" be replaced by "*" and only a
- few "+".
-
- It is important to note that, during the initial acquisition interval
- and others where the timing noise is exceptionally high, the number of
- peers apparently eligible for synchronization may seem bogusly large.
- The clustering algorithm used to select the best subset from a bunch of
- clocks compares the dispersion of each clock relative to the ensemble of
- all available clocks and tosses outlyers until the ensemble dispersion
- is less than any clock dispersion or the number of clocks hits a lower
- bound, currently 3. Details of this intricate procedure can be found in
- the NTP Version 3 specification, Section 4.2.2.
-
- 6. How Will the Tick
-
- There are several projects that need to be completed before the clock
- assemblage can be considered completely stable. First and most urgently,
- Van Jacobson's fixes for the SunOS timekeeping code must be installed in
- all routers and experiment hosts. Second, the NTP configuration at
- dcn1.udel.edu needs to be augmented to include the experiment hosts as
- the addresses are determined. Note that the watchkeeper and UDel
- experiment host pogo.udel.edu have routes to DARTnet and associated
- local nets, but not the other DCnet hosts or time servers.
-
- A problem remains on how to synchronize the experiment hosts and
- possibly other hosts on the various local nets associated with DARTnet.
- While those hosts that know about DARTnet and its associated nets, the
- ISI and UDel time servers don't know about them. It is probably not a
- good idea to teach these servers about the DARTnet topology, since there
- are numerous other clients of these servers on those local nets and the
- preferred routes to them should not transit DARTnet. My suggested
- solution is to configure each of the experiment hosts to chime time
- exclusively from DARTnet routers. While this results the most accurate
- and stable time synchronization for DARTnet experiments, there may be a
- certain reluctance to surrender the clock to an experimental contraption
- in the first place. I don't have trouble doing that here and would
- welcome discussion on the issue.
-
- Finally, some way must be found for the time server at ISI and the
- watchkeeper at UDel to schmooze with each other via DARTnet. This is
- critical for calibrating the ISI timescale against the UDel timescale
- when some DARTers chime with one and others chime with the other.
- Assuming a second Ethernet interface can be spliced to this beast,
- individual host routes can be set up so that traffic only for these
- hosts will ride the DARTnet rails, while traffic for other hosts on the
- same net will wander the NSFnet swamps. It is probably not a good idea
- to avoid the problem by forcing all platforms to synchronize to only one
- of the two servers, but not the other, since they occasionally suffer
- radio timewarps, are rebooted, lose power, loose connectors and whatnot.
- In fact, the present situation is somewhat dangerous in that only two
- servers with two radios are available.
-
- The preferred way to avoid the timewarp problem is to splice a third
- radio to the UDel watchkeeper, so that it in effect becomes a dedicated
- primary server for DARTnet. I can do that if Steve Casner sends me his
- second WWVB clock as promised, which I can then (fix?) and install on
- dcn2.udel.edu. I am hoping Spectracom will give me one of their new WWVB
- clocks for me to evaluate, which also works. Eventually, the clock and
- the watchkeeper machine itself are to move to MIT; but, in that case, I
- assume that some way can be found to splice it to DARTnet from there.
-
- Appendix A. Handy Programs
-
- I do most of my data-reduction chores on a souped-up i386 with Microsoft
- QuickC. Among the handy-dandy programs used are two gems, one to convert
- the statistics file from binary format to ASCII, and the other to
- simulate the NTP algorithms as driven by the ASCII file. I offer them
- for what it's worth and not necessarily as examples of programming
- virtuosity.
-
- The first program below reads an input file in the format previously
- described and writes its ASCII representation on the output file. It
- also does some error checking to cast out junk that can happen due to
- various kinds of server calamities, like updating software versions. The
- command line consists of the input file name, the output file name and a
- selector. If the selector matches the ID of a peer, statistics for only
- that peer are produced. If the selector is zero, statistics for all
- peers are produced.
-
- The second program is the NTP simulator. It prompts for the name of an
- ASCII statistics file and then cranks the NTP algorithms to produce the
- data shown in the main body of this report. Output goes the control
- terminal or can be redirected to a file. This is the same program from
- which the sample C programs given in Appendix I of the NTP Version 3
- specification were extracted.
-
- /* this program reads a statistics file and converts to ascii */
-
- #include <stdio.h>
- #include <ctype.h>
- #define UTCDAY 41317 /* MJD at 1 Jan 1972 */
-
- long int n1 = 0, n2 = 0;
- FILE *fopen(), *fp_in, *fp_out;
-
- main(int argc, char *argv[]) {
- int n;
-
- if (argc < 4) {
- printf ("usage: infile outfile peer\n"); exit (1);
- }
- if ((fp_in = fopen (argv[1], "rb")) == NULL) {
- printf ("input file does not exist\n"); exit (1);
- }
- fp_out = fopen (argv[2], "w");
- if (sscanf(argv[3], "%i", &n) != 1) {
- printf ("invalid argument\n"); exit (1);
- }
- convert_file (fp_in, fp_out, n);
- fclose (fp_in); fclose (fp_out); return (0);
- }
-
- convert_file (FILE *input, FILE *output, int sel) {
- unsigned int buf[8], i, c, date, daychk, hour, minute;
- unsigned long time;
- float second;
- struct rec { /* record produced by fuzzball */
- unsigned int date;
- unsigned long time;
- unsigned int code;
- long offset;
- int delay;
- int disp;
- } *p;
-
- n1 = 0; n2 = 0; daychk = 0;
- while (!feof(input)) {
- c = fread(buf, 2, 8, input);
- c = buf[1]; buf[1] = buf[2]; buf[2] = c; /* endian */
- c = buf[4]; buf[4] = buf[5]; buf[5] = c;
- p = buf;
- date = (p->date & 0x3fff)+UTCDAY; time = p->time;
- if (daychk == 0) daychk = date;
- if ((date > 50000) || (date < daychk) ||
- (time > 86400000) || (p->code == 0)) continue;
- n1++;
- if ((sel > 0) && (sel != (p->code & 0xff))) continue;
- n2++;
- hour = time/3600000;
- minute = (time%3600000)/60000;
- second = (time%60000)/1000.;
- /* fprintf(output, "%6u%3i:%02i:%06.3f", date,
- hour, minute, second); */
- fprintf(output, "%6u%9lu", date, time);
- fprintf(output, " %04x%12li%6i%6i\n", p->code, p->offset,
- p->delay, p-> disp);
- }
- printf("input %li output %li\n", n1, n2);
- }
-
- /*
- NTP simulator
-
- This program reads sample data from a history file and simulates the
- behavior of the NTP filter, selection and local-clock algorithms.
- */
- #include <stdio.h>
- #include <ctype.h>
- #include <math.h>
- /*
- Parameters
- */
- #define C_SYNC 0x8000 /* synch okay */
- #define NMAX 40 /* max clocks */
- #define FMAX 8 /* max filter size */
- #define DAY 86400. /* one day of seconds */
- #define MAXREACH 8192. /* reachability timeout */
- #define HZ 1000. /* clock rate */
- #define FILTER .5 /* filter weight */
- #define SELECT .75 /* select weight */
- #define MAXSTRAT 15. /* max stratum */
- #define MAXSKEW 1. /* max skew error per MAXAGE */
- #define MAXAGE 86400. /* max clock age */
- #define MAXDISP 16. /* max dispersion */
- #define MAXCLOCK 10 /* max candidate clocks */
- #define MINCLOCK 3 /* min survivor clocks */
- /*
- Function declarations
- */
- void filter(int, double, float, float); /* filter algorithm */
- int dts(void); /* intersection algorithm */
- int select(void); /* selection algorithm */
- float combine(void); /* combining algorithm */
- float distance(int); /* compute synch distance */
- void stat(int, float); /* update statistics */
- void display(void); /* display statistics */
- void reset(void); /* reset statistics */
- void exit(int); /* off the bus */
- /*
- Peer state variables
- */
- float filtp[NMAX][FMAX]; /* filter offset samples */
- float fildp[NMAX][FMAX]; /* filter delay samples */
- float filep[NMAX][FMAX]; /* filter dispersion samples */
- float tp[NMAX]; /* offset */
- float dp[NMAX]; /* delay */
- float ep[NMAX]; /* dispersion */
- float rp[NMAX]; /* last offset */
- double utc[NMAX]; /* last timestamp */
- int cd[NMAX]; /* code */
- int st[NMAX]; /* stratum */
- /*
- System state constants/variables
- */
- float rho = 1./HZ; /* max reading error */
- float phi = MAXSKEW/MAXAGE; /* max skew rate */
- float maxdist = MAXDISP/16.; /* max distance */
- float bot, top; /* confidence interval limits */
- float theta; /* clock offset */
- float delta; /* roundtrip delay */
- float epsil; /* dispersion */
- int nclock; /* survivor clocks */
- int source; /* clock source */
- int n1, n2; /* min/max clock ids */
- /*
- Local clock constants
- */
- double Kf = 4194394.; /* frequency weight */
- double Kg = 256.; /* phase weight */
- double Kh = 8192.; /* compliance weight */
- double Ks = 16.; /* compliance maximum */
- double Kt = 8192.; /* compliance multiplier */
- double sigma = 4.; /* adjustment interval */
- /*
- Local clock variables
- */
- double Vs; /* clock filter output */
- double tau; /* PLL time constant */
- double mu; /* update interval */
- double freq; /* frequency error */
- double phase; /* phase error */
- double comp; /* error estimate */
- double tstamp; /* at the tone */
- /*
- Statistics vector
- */
- double d0[NMAX]; /* sample count */
- double d1[NMAX]; /* sum of samples */
- double d2[NMAX]; /* sum of squares of samples */
- double d3[NMAX]; /* sample maximum */
- double d4[NMAX]; /* sample minimum */
- /*
- Input file format
- */
- unsigned int date; /* day number (MJD) */
- unsigned long timex; /* time of day (ms past midnight) */
- unsigned int code; /* entry code */
- float offset; /* clock offset (ms) */
- float delay; /* total delay (ms) */
- float disp; /* total dispersion (ms) */
- /*
- Temporary lists
- */
- float list[3*NMAX]; /* temporary list */
- int index[3*NMAX]; /* index list */
- /*
- Stuff
- */
- unsigned int datey; /* starting date */
- FILE *fopen(), *fp_in; /* input file handle */
- char fn_in[50]; /* input file name */
- float t, u, v, w, x; /* float temps */
- int i, j, k, m, n; /* int temps */
-
- main() {
- printf ("input file: "); scanf ("%s", fn_in);
- fp_in = fopen (fn_in, "r");
- if (fp_in == NULL) {
- printf ("input file %s does not exist\n", fn_in);
- exit (1);
- }
- n1 = 1; n2 = 25;
- reset();
- printf(" Time Offset Freq Delay Disp Dist Clk
- Src\n");
- /*
- Read next sample
- */
- while (fscanf(fp_in, "%i%lu%x%f%f%f",
- &date, &timex, &code, &offset, &delay, &disp) != EOF) {
- timex = timex/1e3; offset = offset/1e3;
- delay = delay/1e3; disp = disp/1e3;
- if (datey <= 0) datey = date;
- tstamp = (date-datey)*DAY+timex;
- if (utc[0] <= 0) utc[0] = tstamp;
- for (n = n1; n <= n2; n++) {
- if (utc[n] <= 0) utc[n] = tstamp;
- if ((cd[n] != 0) && ((tstamp-utc[n]) > MAXREACH)) {
- for (i = FMAX-1; i >= 0; i--) {
- filtp[n][i] = 0.; fildp[n][i] = 0.;
- filep[n][i] = MAXDISP;
- }
- tp[n] = 0.; rp[n] = 0.;
- ep[n] = MAXDISP; utc[n] = 0.;
- cd[n] = 0;
- }
- }
- n = code & 0xff;
- if ((n < n1) || (n > n2)) continue;
- cd[n] = 0;
- if (code&C_SYNC == 0) continue;
- cd[n] = code; st[n] = (code&0x0f00)/256.;
- /*
- Update clock filter and statistics
-
- n = peer id, offset = clock offset, delay = roundtrip delay to root,
- disp = dispersion to root
- */
- t = tstamp-utc[n]; epsil = rho+phi*t+disp;
- filter(n, offset, delay, epsil);
- if (t > 0.) rp[n] = (tp[n]-u)/t;
- stat(n, tp[n]*1e3);
- /*
- Determine survivors and combine
-
- nclock = number of survivor clocks
- */
- i = dts(); if (i < nclock/2) continue;
- i = select(); if (i < 1) continue;
- nclock = i; theta = combine();
- if ((source != n) || (epsil+delta/2. >= maxdist))
- continue;
- /*
- Update local clock and statistics
- */
- dp[0] = delta; ep[0] = epsil; Vs = theta;
- mu = tstamp-utc[0]; utc[0] = tstamp; u = tp[0];
- x = mu; if (x > 1024.) x = 1024.;
- freq = freq+Vs*x/(tau*tau); phase = Vs/tau;
- tau = Ks-fabs(comp); if (tau < 1.) tau = 1.;
- comp = comp+(Kt*Vs-comp)*tau/Kh;
- i = mu/sigma; x = 1.-1./Kg; x = 1.-pow(x, i);
- tp[0] = tp[0]+x*phase+i/Kf*freq;
-
- printf("%8.1f%11.3f%10.4f%8.1f%8.1f%8.1f%5i%5i\n",
- tstamp/60., tp[0]*1e3, 1./(sigma*Kf)*freq*1e6,
- dp[0]*1e3, ep[0]*1e3, distance(0)*1e3, nclock,
- cd[source]&0xff);
- if (mu > 0.) rp[0] = (tp[0]-u)/mu;
- stat(0, tp[0]*1e3);
- }
- fclose(fp_in); printf("Summary\n"); display(); exit(0);
- }
- /*
- Subroutines
-
- Clock filter algorithm
-
- n = peer id, offset = sample offset, delay = sample delay,
- disp = sample dispersion; computes tp[n] = peer offset,
- dp[n] = peer delay, ep[n] = peer dispersion
- */
- void filter(int n, double offset, float delay, float disp) {
- int i, j, k, m; /* int temps */
- float x; /* float temps */
-
- for (i = FMAX-1; i > 0; i--) { /* update/shift filter */
- filtp[n][i] = filtp[n][i-1];
- fildp[n][i] = fildp[n][i-1];
- filep[n][i] = filep[n][i-1]+phi*(tstamp-utc[n]);
- }
- utc[n] = tstamp; filtp[n][0] = offset-tp[0];
- fildp[n][0] = delay; filep[n][0] = disp;
- m = 0; /* construct/sort temp list */
- for (i = 0; i < FMAX; i++) {
- if (filep[n][i] >= MAXDISP) continue;
- list[m] = filep[n][i]+fildp[n][i]/2.; index[m] = i;
- for (j = 0; j < m; j++) {
- if (list[j] > list[m]) {
- x = list[j]; k = index[j];
- list[j] = list[m]; index[j] = index[m];
- list[m] = x; index[m] = k;
- }
- }
- m = m+1;
- }
- if (m <= 0) /* compute filter dispersion */
- ep[n] = MAXDISP;
- else {
- ep[n] = 0;
- for (i = FMAX-1; i >= 0; i--) {
- if (i < m) {
- x = fabs(filtp[n][index[0]]-
- filtp[n][index[i]]);
- }
- else x = MAXDISP;
- ep[n] = FILTER*(ep[n]+x);
- }
- i = index[0]; ep[n] = ep[n]+filep[n][i];
- tp[n] = filtp[n][i]; dp[n] = fildp[n][i];
- }
- return;
- }
- /*
- Intersection algorithm
-
- computes nclock = candidate clocks, bot = lowpoint, top = highpoint;
- returns number of survivor clocks
- */
- int dts() {
- int f; /* intersection ceiling */
- int end; /* endpoint counter */
- int clk; /* falseticker counter */
- int i, j, k, n; /* int temps */
- float x, y; /* float temps */
-
- nclock = 0; i = 0;
- for (n = n1; n <= n2; n++) { /* construct endpoint list */
- if (ep[n] >= MAXDISP) continue;
- nclock++;
- list[i] = tp[n]-distance(n); index[i] = -1; /* lowpoint */
- for (j = 0; j < i; j++) {
- if ((list[j] > list[i]) ||
- ((list[j] == list[i]) &&
- (index[j] > index[i]))) {
- x = list[j]; k = index[j];
- list[j] = list[i]; index[j] = index[i];
- list[i] = x; index[i] = k;
- }
- }
- i = i+1;
- list[i] = tp[n]; index[i] = 0; /* midpoint */
- for (j = 0; j < i; j++) {
- if ((list[j] > list[i]) ||
- ((list[j] == list[i]) &&
- (index[j] > index[i]))) {
- x = list[j]; k = index[j];
- list[j] = list[i]; index[j] = index[i];
- list[i] = x; index[i] = k;
- }
- }
- i = i+1;
- list[i] = tp[n]+distance(n); index[i] = 1; /* highpoint */
- for (j = 0; j < i; j++) {
- if ((list[j] > list[i]) ||
- ((list[j] == list[i]) &&
- (index[j] > index[i]))) {
- x = list[j]; k = index[j];
- list[j] = list[i]; index[j] = index[i];
- list[i] = x; index[i] = k;
- }
- }
- i = i+1;
- }
- if (nclock <= 0) return 0;
- for (f = 0; ; f++) { /* find low endpoint */
- clk = 0; end = 0;
- for (j = 0; j < i; j++) {
- end = end-index[j]; bot = list[j];
- if (end >= (nclock-f)) break;
- if (index[j] == 0) clk = clk+1;
- }
- end = 0;
- for (j = i-1; j >= 0; j--) {
- end = end+index[j]; top = list[j];
- if (end >= (nclock-f)) break;
- if (index[j] == 0) clk = clk+1;
- }
- if (clk <= f) break;
- }
- return nclock-clk;
- }
- /*
- Selection algorithm
-
- bot = lowpoint, top = highpoint; computes nclock = number of
- candidate clocks, index = candidate index list, source = clock
- source, theta = clock offset, delta = roundtrip delay,
- epsil = dispersion; returns number of survivor clocks
- */
- int select() {
- float xi; /* max select dispersion */
- float eps; /* min peer dispersion */
- int i, j, k, m, n; /* int temps */
- float x, y, z; /* float temps */
-
- m = 0;
- for (n = n1; n <= n2; n++) { /* construct/sort candidate list */
- if ((st[n] > 0) && (st[n] < MAXSTRAT) &&
- (tp[n] >= bot) && (tp[n] <= top)) {
- list[m] = MAXDISP*st[n]+distance(n); index[m] = n;
- for (j = 0; j < m; j++) {
- if (list[j] > list[m]) {
- x = list[j]; k = index[j];
- list[j] = list[m];
- index[j] = index[m];
- list[m] = x; index[m] = k;
- }
- }
- m = m+1;
- }
- }
- nclock = m;
- if (m == 0) {
- source = 0; return m;
- }
- if (m > MAXCLOCK) m = MAXCLOCK;
- while (1) { /* cast out falsetickers */
- xi = 0.; eps = MAXDISP;
- for (j = 0; j < m; j++) {
- x = 0.;
- for (k = m-1; k >= 0; k--)
- x = SELECT*(x+fabs(tp[index[j]]-
- tp[index[k]]));
- if (x > xi) { /* max(xi) */
- xi = x; i = j;
- }
- x = ep[index[j]]+phi*(tstamp-utc[index[j]]);
- if (x < eps) eps = x; /* min(eps) */
- }
- if ((xi <= eps) || (m <= MINCLOCK)) break;
- if (index[i] == source) source = 0;
- for (j = i; j < m-1; j++) index[j] = index[j+1];
- m = m-1;
- }
- i = index[0]; /* declare winner */
- if (source != i)
- if (source == 0) source = i;
- else if (st[i] < st[source]) source = i;
- theta = tp[i]; delta = dp[i];
- epsil = ep[i]+phi*(tstamp-utc[i])+xi+fabs(tp[i]);
- return m;
- }
- /*
- Combining algorithm
-
- index = candidate index list, m = number of candidates;
- returns combined clock offset
- */
- float combine() {
- int i; /* int temps */
- float x, y, z; /* float temps */
-
- z = 0. ; y = 0.;
- for (i = 0; i < nclock; i++) { /* compute weighted offset */
- j = index[i]; x = distance(j);
- z = z+tp[j]/x; y = y+1./x;
- }
- return z/y; /* normalize */
- }
- /*
- Compute synch distance
-
- n = peer id; returns synchronization distance
- */
- float distance(int n) {
-
- return ep[n]+phi*(tstamp-utc[n])+dp[n]/2.;
- }
- /*
- Update statistics
- */
- void stat(int n, float u) {
-
- d0[n] = d0[n]+1; d1[n] = d1[n]+u; d2[n] = d2[n]+u*u;
- if (u > d3[n]) d3[n] = u; if (u < d4[n]) d4[n] = u;
- return;
- }
- /*
- Display statistics
- */
- void display() {
- int i; /* int temps */
-
- printf(" ID Samp Mean StdDev Max Min\n");
- for (i = 0; i < NMAX; i++) {
- if (d0[i] <= 1.) continue;
- v = sqrt(d2[i]/d0[i]-d1[i]/d0[i]*d1[i]/d0[i]);
- /* v = sqrt(d2[i]/(2*(d0[i]-1.))); */
- printf("%3i%8.0f%10.3f%10.3f%10.3f%10.3f\n",
- i, d0[i], d1[i]/d0[i], v, d3[i], d4[i]);
- /* fprintf(fp_out,
- "%3i%8.0f%10.3f%10.3f%12.3f%12.3f%10.3f\n",
- i, d0[i], d1[i]/d0[i], v, d3[i], d4[i]); */
- }
- }
- /*
- Reset statistics
- */
- void reset() {
- int i, j; /* int temps */
-
- freq = 0.; phase = 0.; comp = 0.; tau = 1.; source = 0;
- for (i = 0; i < NMAX; i++) {
- for (j = FMAX-1; j >= 0; j--) {
- filtp[i][j] = 0.; fildp[i][j] = 0.;
- filep[i][j] = MAXDISP;
- }
- tp[i] = 0.; rp[i] = 0.; ep[i] = MAXDISP;
- utc[i] = 0.; cd[i] = 0; st[i] = 0;
- d0[i] = 0.; d1[i] = 0.; d2[i] = 0.;
- d3[i] = -1e10; d4[i] = 1e10;
- }
- }
-